Improving Pronoun Translation for Statistical Machine Translation

Author

  • Liane Guillou
Abstract

Machine Translation is a well-established field, yet the majority of current systems translate sentences in isolation, losing valuable contextual information from previously translated sentences in the discourse. One important type of contextual information concerns who or what a coreferring pronoun corefers to (i.e., its antecedent). Languages differ significantly in how they achieve coreference, and awareness of antecedents is important in choosing the correct pronoun. Disregarding a pronoun’s antecedent in translation can lead to inappropriate coreferring forms in the target text, seriously degrading a reader’s ability to understand it. This work assesses the extent to which source-language annotation of coreferring pronouns can improve English–Czech Statistical Machine Translation (SMT). As with previous attempts that use this method, the results show little improvement. This paper attempts to explain why and to provide insight into the factors affecting performance.

Related resources

Improving Pronoun Translation for Statistical Machine Translation (SMT)

Machine Translation is a well-established field, yet the majority of current systems perform the translation of sentences in complete isolation, losing valuable contextual information from previously translated sentences in the discourse. One such class of contextual information concerns who or what it is that a reduced referring expression such as a pronoun is meant to refer to. The use of ina...

Improving Pronoun Translation by Modeling Coreference Uncertainty

Information about the antecedents of pronouns is considered essential to solve certain translation divergences, such as those concerning the English pronoun it when translated into gendered languages, e.g., into French as il, elle, or several other options. However, no machine translation system using anaphora resolution has so far been able to outperform a phrase-based statistical MT baseline...

Zero Pronoun Resolution can Improve the Quality of J-E Translation

In Japanese, particularly spoken Japanese, subjective, objective, and possessive cases are very often omitted. Japanese-English statistical machine translation often renders such sentences as English sentences in which the subjective, objective, and possessive cases are likewise omitted, which decreases the quality of translation. We performed experiments of J-E phrase based tran...

Translating Pronouns with Latent Anaphora Resolution

We discuss the translation of anaphoric pronouns in statistical machine translation from English into French. Pronoun translation requires resolving the antecedents of the pronouns in the input, a classic discourse processing problem that is usually approached through supervised learning from manually annotated data. We cast cross-lingual pronoun prediction as a classification task and present ...

ParCor 1.0: A Parallel Pronoun-Coreference Corpus to Support Statistical MT

We present ParCor, a parallel corpus of texts in which pronoun coreference – reduced coreference in which pronouns are used as referring expressions – has been annotated. The corpus is intended to be used both as a resource from which to learn systematic differences in pronoun use between languages and ultimately for developing and testing informed Statistical Machine Translation systems aimed ...

Publication date: 2012